Abstract


Linear programming (LP) relaxations are a pop- 
ular method to attempt to find a most likely con- 
figuration of a discrete graphical model. If a so- 
lution to the relaxed problem is obtained at an 
integral vertex then the solution is guaranteed to 
be exact and we say that the relaxation is tight. 
We consider binary pairwise models and intro- 
duce new methods which allow us to demonstrate 
refined conditions for tightness of LP relaxations 
in the Sherali-Adams hierarchy. Our results in- 
clude showing that for higher order LP relax- 
ations, treewidth is not precisely the right way 
to characterize tightness. This work is primarily 
theoretical, with insights that can improve effi- 
ciency in practice. 


1 INTRODUCTION 


Discrete undirected graphical models are widely used in 
machine learning, providing a powerful and compact way 
to model relationships between variables. A key chal- 
lenge is to identify a most likely configuration of variables, 
termed maximum a posteriori (MAP) or most probable ex- 
planation (MPE) inference. There is an extensive literature 
on this problem from various communities, where it may
be described as energy minimization (Kappes et al., 2013)
or solving a valued constraint satisfaction problem (VCSP,
Schiex et al., 1995).


Throughout this paper, we focus on the important class 
of binary pairwise models (Ising models), allowing arbi- 
trary singleton and edge potentials. For this class, the 
MAP problem is sometimes described as quadratic pseudo- 
Boolean optimization (QPBO, e.g. [1984). 
In these models, each edge potential may be characterized 
as either attractive (tending to pull its end variables toward 
the same value; equivalent to a submodular cost function) 
or repulsive. Eaton and Ghahramani (2013) showed that
any discrete model may be arbitrarily well approximated
by a binary pairwise model, though this may require a large
increase in the number of variables.


MAP inference is NP-hard for a general binary pairwise 
model, hence much work has attempted to identify settings 
where polynomial-time methods are feasible. We call such 
settings tractable and the methods efficient. 


In this work, we consider a popular approach which first
expresses the MAP problem as an integer linear program
(ILP) then relaxes this to a linear program (LP); see §2 for
details. An LP attains an optimum at a vertex of the feasible
region: if the vertex is integral then it provides an exact
solution to the original problem and we say that the LP is
tight. If the LP is performed over the marginal polytope,
which enforces global consistency (Wainwright and Jordan,
2008), then the LP will always be tight but exponentially
many constraints are required, hence the method is
not efficient. The marginal polytope M is typically relaxed
to the local polytope L2, which enforces consistency only 
over each pair of variables (thus yielding pseudomarginals 
which are pairwise consistent but may not correspond to 
any true global distribution), requiring only a number of 
constraints which is linear in the number of edges. 


LP relaxations are widely used in practice. However, the 
most common form LP+L2 often yields a fractional solution,
thus motivating more accurate approaches which enforce
higher order cluster consistency (2011).
A well-studied example is foreground-background image 
segmentation. If edge potentials are learned from data, 
because objects in the real world are contiguous, most 
edges will be attractive (typically neighboring pixels will 
be pulled toward the same identification of foreground or 
background unless there is strong local data from color 
or intensity). On the horses dataset considered by Domke
(2013), LP+L2 is loose but if triplet constraints are added,
then the LP relaxation is often tight. Our work helps to 
explain and understand such phenomena. This has clear 
theoretical value and can improve efficiency in practice. 


Sherali and Adams (1990) introduced a series of successively
tighter relaxations of the marginal polytope: for any
integer r, L_r enforces consistency over all clusters of
variables of size ≤ r. For any fixed r, LP+L_r is solvable
in polynomial time: higher r leads to improved accuracy
but higher runtime. Most earlier work considered the case
r = 2, though recently there has been progress in understanding
conditions for tightness for LP+L3 (Weller et al.,
2016; Weller, 2016b,a).


Here we significantly improve on the result for L3 of
Weller et al. (2016), and provide important new results for when
LP+L_r is guaranteed to be tight, employing an interesting
geometric perspective. Our main contributions are summarized
in §1.3. We first develop the background and context,
see for a more extensive survey.

Most previous work considers separately two different 
types of restricted settings that guarantee tightness, either: 
(i) constraining the potentials to particular families; or (ii) 
placing structural restrictions on the topology of connec- 
tions between variables. As an example of the first type 
of restriction, it is known that if all edge potentials are at- 
tractive (equivalently, if all cost functions are submodular), 
then the basic relaxation LP+L2 is tight. In fact,
showed that for discrete models with 
variables with any finite number of labels, and potentials of 
any order: if no restriction is placed on topology, then for 
a given family of potentials, either the LP relaxation on the 
natural local polytope is always tight, and hence solves all 
such problems efficiently, or the problem set is NP-hard. 


1.1 Treewidth, Minors and Conditions for Tightness 


Exploring the class of structural restrictions,
showed that subject to mild assumptions,
if no restriction is placed on types of potentials,
then the structural constraint of bounded treewidth is
needed for tractable marginal inference.¹ Indeed,
Wainwright and Jordan (2004) proved that if a model topology
has treewidth ≤ r − 1 then this is sufficient to guarantee
tightness for LP+L_r. As a well-known simple example, if
a connected model has treewidth 1 (equivalent to being a
tree), then the standard relaxation LP+L2 is always tight.


The graph property of treewidth ≤ r − 1 is closed under taking
minors (definitions in §5.1, examples in Figures 1 and
2), hence by the celebrated graph minor theorem (Robertson
and Seymour, 2004), the property may be characterized
by forbidding a unique finite set of minimal forbidden minors.
Said differently, of all the graphs with treewidth ≥ r,
there is a unique finite set T_r of graphs which are minimal
with respect to minor operations. Hence, the sufficient
condition of Wainwright and Jordan (2004) for tightness of
LP+L_r may be reframed as: if the graph of a model does
not contain any graph in the set T_r as a minor, then LP+L_r
is guaranteed to be tight for any potentials.


¹The treewidth of a graph is one less than the minimum size of
a maximum clique in a triangulation of the graph, as used in the
junction tree algorithm. Marginal inference seeks the marginal
probability distribution for a subset of variables, which is typically
harder than MAP inference.


The relevant sets of forbidden minors for L2 and L3 are
particularly simple with just one member each: T2 = {K3}
and T3 = {K4}, where K_n is the complete graph on n
vertices. For higher values of r, T_r always contains K_{r+1}
but there are also other forbidden minors, and their number
grows rapidly: T4 has 4 members (Arnborg et al.)
while T5 has over 70 (1993).


Weller (2016a) showed that, for any r, the graph property
of LP+L_r being tight for all valid potentials on the graph
is also closed under taking minors. Hence, by Robertson-
Seymour, the property for LP+L_r may be characterized by
forbidding a unique set of minimal forbidden minors U_r. It
was shown that, in fact, U2 = T2 = {K3} and U3 = T3 =
{K4}. However, until this work, all that was known about
U4 is that it contains the complete graph K5: it has been an
open question whether or not U4 = T4.


One of our main contributions here is to show that U4 ≠
T4. Indeed, in §5 we show that U4 ∩ T4 = {K5} and that
U4 must contain at least one other forbidden minor, which
we cannot yet identify. This progress on understanding U4
is a significant theoretical development, demonstrating that
in general, treewidth is not precisely the right way to characterize
tightness of LP relaxations.


Whereas Weller (2016a) made extensive use of powerful
earlier results in combinatorics in order to identify U3,
including two results which won the prestigious Fulkerson
prize (2001), our analysis takes
a different, geometric approach (developed in §4 and §5),
which may be of independent interest.


1.2 Stronger Hybrid Conditions 


Throughout §1.1, we considered only the graph topology
of a model's edge potentials. If we also have access to the
signs of each edge (attractive or repulsive), then stronger
results may be derived. By combining restrictions on both
classes of potentials and structure, these are termed hybrid
conditions (Cooper and Zivny, 2011).


In this direction, Weller (2016a) showed that for a signed
graph, the property that LP+L_r is tight for all valid potentials
(now respecting the graph structure and edge signs)
is again closed under taking minors, hence again may be
characterized by a finite set of minimal forbidden minors
U'_r. Further, Weller showed that for r = 2 and r = 3,
the forbidden minors are precisely only the odd versions
of the forbidden minors for a standard unsigned graph,
where an odd version of a graph G means the signed version
of G where every edge is repulsive (a repulsive edge
is sometimes called odd). That is, U'2 = {odd-K3} and
U'3 = {odd-K4}. To see the increased power of these results,
observe for example that this means that LP+L3 is
tight for any model of any treewidth, even if it contains K4
minors, provided only that it does not contain the particular
signing of K4 where all edges are odd (or an equivalent
resigning thereof, see §5.1).


In §5 we show somewhat similarly that of all possible
signings of K5, it is only an odd-K5 which leads to
non-tightness of LP+L4.


1.3 Main Contributions 


Given the background in §1.1 and §1.2, here we highlight
key contributions.


In §3 we significantly strengthen the result of Weller et al.
(2016) for an almost balanced model (which contains a
distinguished variable s.t. removing it renders the model
balanced, see §3.1 for precise definitions). For such a model, it
was shown that LP+L3 is always tight. Here we show that
we may relax the polytope from L3 to a variant we call L̃3,
which enforces triplet constraints only for those triplets
including the one distinguished variable s, while still
guaranteeing tightness. This has important practical implications
since we are guaranteed tightness with dramatically fewer
linear constraints, and thus much faster runtime.


We also show in §3 that for any model (no restriction on
potentials), enforcing all triplet constraints of L3 is equiv- 
alent to enforcing only those involving the edges of any 
triangulation (chordal envelope) of its graph, which may 
significantly improve runtime. 


In §4 we introduce a geometric perspective on the tightness
of LP relaxations, which may be of independent interest. 


In §5 we use these geometric methods to provide powerful
new conditions on the tightness of LP+L4. These show
that the relationship which holds between forbidden minors
characterizing treewidth and LP+L_r tightness for r = 2
and r = 3 breaks down for r = 4, hence demonstrating that
treewidth is not precisely the right condition for analyzing
tightness of higher-order LP relaxations.


1.4 Related Work 


We discuss related work throughout the text. To our knowledge,
aside from (Weller, 2016a), there is little prior work
which considers conditions on signed minors for inference
in graphical models. (2011) derives a similar
characterization to identify when belief propagation has a
unique fixed point.


2 BACKGROUND AND PRELIMINARIES 


2.1 The MAP Inference Problem 


A binary pairwise graphical model is a collection of random
variables (X_i)_{i∈V}, each taking values in {0, 1}, such
that the joint probability distribution may be written in a
minimal representation (Wainwright and Jordan, 2008) as

$$ P(X = x) \propto \exp\Big( \sum_{i \in V} \theta_i x_i + \sum_{ij \in E} W_{ij} x_i x_j \Big), \tag{1} $$

for some potentials θ_i ∈ ℝ for all i ∈ V, and W_ij ∈ ℝ
for all ij ∈ E, where each edge in E is a pair of distinct
vertices of V. We identify the topology of the
model as the graph G = (V, E). When W_ij > 0, there is a
preference for X_i and X_j to take the same setting and the
edge ij is attractive; when W_ij < 0, the edge is repulsive
(see 2014, §2 for details).


A fundamental problem for graphical models is maximum
a posteriori (MAP) inference, which asks for the identification
of a most likely joint state of all the random variables
(X_i)_{i∈V} under the probability distribution specified in (1).


The MAP inference problem is clearly equivalent to maximizing
the argument of the exponential in (1), yielding the
following integer quadratic program:

$$ \max_{x \in \{0,1\}^{V}} \Big[ \sum_{i \in V} \theta_i x_i + \sum_{ij \in E} W_{ij} x_i x_j \Big]. \tag{2} $$

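As a small illustration of Problem (2), the following sketch enumerates all configurations of a tiny model and returns a maximizer. It is only a brute-force check, exponential in |V|, and not a method from the paper; the function names and the three-variable repulsive triangle are hypothetical choices made for illustration.

```python
from itertools import product

def score(x, theta, W):
    """Objective of Problem (2): sum_i theta_i x_i + sum_ij W_ij x_i x_j."""
    singles = sum(theta[i] * x[i] for i in theta)
    pairs = sum(w * x[i] * x[j] for (i, j), w in W.items())
    return singles + pairs

def brute_force_map(theta, W):
    """Enumerate all 2^|V| configurations; only feasible for tiny models."""
    nodes = sorted(theta)
    best_x, best_val = None, float("-inf")
    for bits in product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, bits))
        val = score(x, theta, W)
        if val > best_val:  # keep the first maximizer encountered
            best_x, best_val = x, val
    return best_x, best_val

# Hypothetical model: a triangle whose three edges are all repulsive.
theta = {0: 1.0, 1: 1.0, 2: 1.0}
W = {(0, 1): -2.0, (0, 2): -2.0, (1, 2): -2.0}
map_x, map_val = brute_force_map(theta, W)
```

On this instance every optimal configuration switches on exactly one variable, since turning on a second variable gains 1 from its singleton potential but loses 2 on the shared edge.
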
2.2 LP Relaxations for MAP Inference 


A widely used approach to solving Problem (2) is first to
replace the integer programming problem with an equivalent
linear program (LP), and then to optimize the objective
over a relaxed polytope with a polynomial number of linear
constraints. This leads to an LP which is efficient to solve
but may return a fractional solution (in which case, branching
or cutting approaches are often used in practice).


In detail, from Problem (2) an auxiliary variable x_ij = x_i x_j
is introduced for each edge, as in Problem (3):

$$ \max_{\substack{x \in \{0,1\}^{V \cup E} \\ x_{ij} = x_i x_j \ \forall ij \in E}} \Big[ \sum_{i \in V} \theta_i x_i + \sum_{ij \in E} W_{ij} x_{ij} \Big]. \tag{3} $$


With the objective now linear, an LP may be formed by optimizing
over the convex hull of the optimization domain of
Problem (3). This convex hull is called the marginal polytope
(Wainwright and Jordan, 2008) and denoted M(G).
Thus we obtain the equivalent problem:

$$ \max_{q \in M(G)} \Big[ \sum_{i \in V} \theta_i q_i + \sum_{ij \in E} W_{ij} q_{ij} \Big]. \tag{4} $$



This LP is in general no more tractable than Problem
(3), due to the number of constraints needed to describe
M(G). It is standard to obtain a tractable problem by relaxing
the domain of optimization M(G) to some larger polytope
L which is easier to describe. The resulting relaxed
optimization yields an upper bound on the optimal value of
Problem (2), though if the arg max over L is an extremal
point of M(G), then the approximation must be exact, and
we say that the relaxation is tight for this problem instance.


We focus on the family of relaxations introduced by Sherali
and Adams (1990), defined below. We first consider a
probabilistic interpretation of the marginal polytope.

Notation. For any finite set Z, let 𝒫(Z) be the set of
probability measures on Z.

The marginal polytope can then be written

$$ M(G) = \Big\{ ((q_i)_{i \in V}, (q_{ij})_{ij \in E}) \;\Big|\; \exists\, \mu \in \mathcal{P}(\{0,1\}^V) \text{ s.t. } q_i = P_\mu(X_i = 1) \ \forall i \in V, \ q_{ij} = P_\mu(X_i = 1, X_j = 1) \ \forall ij \in E \Big\}. \tag{5} $$


Intuitively, the constraints of the marginal polytope enforce
a global consistency condition for the set of parameters
((q_i)_{i∈V}, (q_ij)_{ij∈E}), so that together they describe
marginal distributions of some global distribution over the
entire set of variables. A natural approach is to relax this
condition of global consistency to a less stringent notion of
consistency only for smaller clusters of variables.

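Definition 1. The Sherali-Adams polytope L_r(G) of order
r for a binary pairwise graphical model on G = (V, E) is

$$ L_r(G) = \Big\{ ((q_i)_{i \in V}, (q_{ij})_{ij \in E}) \;\Big|\; \forall \alpha \subseteq V \text{ with } |\alpha| \le r, \ \exists\, \tau_\alpha \in \mathcal{P}(\{0,1\}^{\alpha}) \text{ s.t. } \tau_\alpha|_\beta = \tau_\beta \ \forall \beta \subseteq \alpha, \text{ and } \tau_{\{i,j\}}(1,1) = q_{ij} \ \forall ij \in E, \ \tau_{\{i\}}(1) = q_i \ \forall i \in V \Big\}, \tag{6} $$

where for β ⊆ α, τ_α|_β denotes the marginal distribution of
τ_α on {0, 1}^β.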


Considering successive fixed values of r ∈ ℕ, the Sherali-
Adams polytopes therefore yield the following sequence of
tractable approximations to Problem (2):

$$ \max_{q \in L_r(G)} \Big[ \sum_{i \in V} \theta_i q_i + \sum_{ij \in E} W_{ij} q_{ij} \Big]. \tag{7} $$

As r increases, so too does the number of linear constraints
required to define L_r, leading to a tighter polytope
and a more accurate solution, but at the cost of greater
computational complexity. For r = |V|, L_r(G) is exactly
equal to the marginal polytope M(G).


A fundamental question concerning LP relaxations for
MAP inference is when it is possible to use a computationally
cheap relaxation L_r(G) and still obtain an exact
answer to the original inference problem. This is of great
practical importance, as it leads to tractable algorithms for
particular classes of problems that in full generality are NP-hard.
In this paper we investigate this question for a variety
of problem classes for the Sherali-Adams relaxations
L2(G), L3(G), and L4(G), also referred to as the local,
triplet, and quad polytopes respectively.
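
To make the notion of a loose relaxation concrete, the sketch below checks the pairwise consistency constraints of the local polytope on the edges of a model and exhibits a fractional point whose objective value exceeds that of every integral configuration. It does not solve the LP; it only shows that the optimum over L2 cannot be attained at an integral vertex for this instance. The helper names and the numbers in the repulsive triangle are hypothetical, and the check assumes every variable lies on at least one edge.

```python
from itertools import product

def in_local_polytope(q_node, q_edge, tol=1e-9):
    """Pairwise consistency on each model edge: the implied 2x2 table
    (q_ij, q_i - q_ij, q_j - q_ij, 1 - q_i - q_j + q_ij) must be non-negative."""
    for (i, j), qij in q_edge.items():
        table = (qij, q_node[i] - qij, q_node[j] - qij,
                 1 - q_node[i] - q_node[j] + qij)
        if min(table) < -tol:
            return False
    return True

def objective(q_node, q_edge, theta, W):
    """Linear objective shared by Problems (4) and (7)."""
    return (sum(theta[i] * q_node[i] for i in theta)
            + sum(W[e] * q_edge[e] for e in W))

# Hypothetical repulsive triangle (illustrative numbers only).
theta = {0: 1.0, 1: 1.0, 2: 1.0}
W = {(0, 1): -2.0, (0, 2): -2.0, (1, 2): -2.0}

# Best integral point: enumerate x and lift to q via q_ij = x_i * x_j.
best_integral = max(
    objective(dict(enumerate(x)), {(i, j): x[i] * x[j] for (i, j) in W}, theta, W)
    for x in product((0, 1), repeat=3)
)

# A fractional pseudomarginal: every variable at 1/2, no pair jointly on.
q_node = {0: 0.5, 1: 0.5, 2: 0.5}
q_edge = {e: 0.0 for e in W}
fractional_value = objective(q_node, q_edge, theta, W)
```

The fractional point is pairwise consistent yet corresponds to no global distribution, since three binary variables cannot all disagree pairwise; this is the kind of point that triplet constraints exclude.
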


3 REFINING TIGHTNESS RESULTS
FOR L3(G)


Here we first derive new results for the triplet polytope
L3 that significantly strengthen earlier work (Weller et al.,
2016) for almost balanced models. Then in §3.3 we
demonstrate that for any model, triplet constraints need be
applied only over edges in some triangulation of its graph
(that is, over any chordal envelope).


3.1 Graph-theoretic Preliminaries 


A signed graph is a graph G = (V, E) together with a
function Σ : E → {even, odd}, where attractive edges are
even and repulsive edges are odd. A signed graph is balanced
if its vertices V may be partitioned
into two exhaustive sets V1, V2 such that all edges with both
endpoints in V1 or both endpoints in V2 are even, whilst all
edges with one endpoint in each of V1 and V2 are odd.


A signed graph G = (V, E, Σ) is almost balanced (Weller)
if there exists some distinguished vertex s ∈ V such
that removing s leaves the remainder balanced (thus any
balanced graph is almost balanced). Detecting whether a graph
is almost balanced, and if so finding a distinguished
vertex, may be performed efficiently (simply hold out one
variable at a time and test the remainder to see if it is balanced,
Harary and Kabell, 1980). A graphical model is almost
balanced if the signed graph corresponding to its edge
potentials is almost balanced.
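
The hold-one-out test described above can be sketched as follows. The balance check here is a plain breadth-first two-colouring, which is one standard way to test the partition condition and is an assumption of this sketch rather than a description of the cited algorithm; the function names and the edge encoding (a dict from vertex pairs to 'even' or 'odd') are hypothetical.

```python
from collections import deque

def is_balanced(nodes, signed_edges):
    """Try to split the vertices into two sides so that even edges stay
    within a side and odd edges cross between sides."""
    adj = {u: [] for u in nodes}
    for (u, v), sign in signed_edges.items():
        adj[u].append((v, sign))
        adj[v].append((u, sign))
    side = {}
    for start in nodes:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adj[u]:
                want = side[u] if sign == "even" else 1 - side[u]
                if v not in side:
                    side[v] = want
                    queue.append(v)
                elif side[v] != want:
                    return False  # an inconsistent cycle was found
    return True

def distinguished_vertices(nodes, signed_edges):
    """Vertices whose removal leaves a balanced signed graph."""
    found = []
    for s in nodes:
        rest = [u for u in nodes if u != s]
        kept = {e: sg for e, sg in signed_edges.items() if s not in e}
        if is_balanced(rest, kept):
            found.append(s)
    return found

odd_k3 = {(0, 1): "odd", (0, 2): "odd", (1, 2): "odd"}
odd_k4 = {(i, j): "odd" for i in range(4) for j in range(i + 1, 4)}
```

An odd-K3 is not balanced but is almost balanced with any vertex as the distinguished one, whereas deleting any vertex of an odd-K4 leaves an odd triangle, so an odd-K4 is not almost balanced.
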


3.2 Tightness of LP Relaxations for Almost Balanced 
Models 


The following result was shown by Weller et al. (2016).


Theorem 2. Given a graph G, the triplet polytope L3(G)
is tight on the class of almost balanced models on G.


We present a significant strengthening of Theorem 2 in a
new direction, which identifies exactly which constraints
of L3(G) are required to ensure tightness for the class of
almost balanced models.


Given an almost balanced model with distinguished variable
s (that is, deleting s renders the model balanced), define
the polytope L̃3(G) by taking all the pairwise constraints
of L2(G), and adding triplet constraints only for
triplets of variables that include s (see §10.4.1 in the
Supplement for more details).


L̃3(G) is a significant relaxation of L3(G), requiring only
O(|V|^2) linear constraints, rather than the O(|V|^3) constraints
needed for L3(G). Thus it is substantially faster to
optimize over L̃3(G). Nevertheless, our next result shows
that L̃3(G) is still tight for an almost balanced model, and
indeed is the 'loosest' possible polytope with this property.


Theorem 3. The polytope L̃3(G) is tight on the class of almost
balanced models on graph G with distinguished variable
s. Further, no linear constraint of L̃3(G) may be removed
to yield a polytope which is still tight on all models
in this class of potentials. Proof in the Supplement.


Considering cutting plane methods, Theorem 3 demonstrates
exactly which constraints from L3(G) may be necessary
to add to the polytope L2(G) in order to achieve
tightness for an almost balanced model.
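
The saving can be made concrete by counting triplet clusters, each of which contributes a constant number of linear constraints. The sketch below is only a counting aid under that assumption; the function name is hypothetical.

```python
from math import comb

def triplet_cluster_counts(n):
    """Return (all triplets of n variables, triplets containing a fixed s)."""
    all_triplets = comb(n, 3)         # grows as O(n^3)
    triplets_with_s = comb(n - 1, 2)  # choose the two partners of s: O(n^2)
    return all_triplets, triplets_with_s

full, restricted = triplet_cluster_counts(100)
```

For a model with 100 variables this gives 161700 triplets in total against 4851 that include the distinguished variable.
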


3.3 Chordality and Extending Partial Marginals 


By its definition, the polytope L2 enforces pairwise constraints
on every pair of variables in a model, whether or
not they are connected by an edge, yet typically one enforces
constraints only for edges E in the model. This is
sufficient because it is not hard to see that if one has edge
marginals for the graph G = (V, E), it is always possible to
extend these to edge marginals for the complete graph on
V while remaining within L2 (and the values of these additional
marginals are irrelevant to the score by assumption).


Here we provide an analogous result for L3, which shows 
that for any model (no restriction on potentials), one 
need only enforce triplet constraints over any triangulation 
(chordal envelope) of its graph. 


Theorem 4. For a chordal graph G, the polytope L2(G)
together with the triplet constraints only for those triplets
of variables that form 3-cliques in G, is equal to L3(G), i.e.
the polytope given by enforcing constraints on all triplets
of G. Proof in the Supplement §9.
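
To see which triplets Theorem 4 retains, the following sketch lists the 3-cliques of a small chordal graph and compares their number with the number of all triplets. The graph (a chain of three triangles sharing edges) and the function name are hypothetical; the enumeration is cubic in |V| and is meant only for small examples.

```python
from itertools import combinations

def three_cliques(nodes, edges):
    """All vertex triples whose three pairs are edges of the graph."""
    edge_set = {frozenset(e) for e in edges}
    return [t for t in combinations(sorted(nodes), 3)
            if all(frozenset(p) in edge_set for p in combinations(t, 2))]

# A chain of triangles glued along edges, which is chordal.
nodes = range(5)
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
cliques = three_cliques(nodes, edges)
total_triplets = len(list(combinations(nodes, 3)))
```

Here only 3 of the 10 possible triplets form 3-cliques, so only those need triplet constraints.
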


4 THE GEOMETRY OF 
SHERALI-ADAMS RELAXATIONS 


Here we introduce several geometric notions for the
Sherali-Adams polytopes which we shall apply in §5.


4.1 The Geometry of the Sherali-Adams Polytopes 


The study of tightness of LP relaxations is naturally ex- 
pressed in the language of polyhedral combinatorics. We 
introduce key notions from the literature, then provide new 
proofs of characterizations of tightness for the local poly- 
tope L2(G) with these geometric ideas. 


Given a polytope P ⊆ ℝ^m, and an extremal point (vertex)
v ∈ Ext(P), the normal cone to P at v, denoted N_P(v), is
the polyhedral cone defined by

$$ N_P(v) = \{ c \in \mathbb{R}^m \mid v \in \arg\max_{x \in P} \langle c, x \rangle \}. \tag{8} $$

We define the conical hull of a finite set X as: Cone(X) =
{Σ_{x∈X} λ_x x | λ_x ≥ 0 ∀ x ∈ X}. The following characterization
of normal cones will be particularly useful.


Lemma 5 (Theorem 2.4.9, (1993)). Let P =
{x ∈ ℝ^m | Ax ≤ b} be a polytope for some A =
[a_1, ..., a_k]^T ∈ ℝ^{k×m}, b ∈ ℝ^k (for some k ∈ ℕ). Then
for v ∈ Ext(P), we have

$$ N_P(v) = \mathrm{Cone}(\{ a_i \mid \langle a_i, v \rangle = b_i \}). \tag{9} $$

Further, if the representation {x ∈ ℝ^m | Ax ≤ b} has no
redundant constraints, then {a_i | ⟨a_i, v⟩ = b_i} is a complete
set of extremal rays of N_P(v) (up to scalar multiplication).


With these geometric notions in hand, we have a succinct 
characterization of the set of potentials for which a given 
Sherali-Adams relaxation is tight. 


Lemma 6. The set of potentials which are tight with respect
to L_r is exactly given by the following union of cones

$$ \bigcup_{v \in \mathrm{Ext}(M(G))} N_{L_r(G)}(v). \tag{10} $$



This concise characterization, together with the explicit
parametrization of normal cones given by Lemma 5 and
the form of the linear constraints defining the local polytope
L2(G), yields an efficient algorithm for generating arbitrary
potentials which are tight with respect to L2(G).
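
One way such a generator might look is sketched below, assuming the standard four inequalities per edge for the local polytope: pick an integral configuration, collect the constraints that are active there, and form a non-negative combination of their normals as in (9). The function names, the integer weights and the triangle are hypothetical, and the final check only confirms by enumeration that the chosen configuration is optimal among integral ones.

```python
import random
from itertools import product

def l2_rows(edges):
    """Rows (a, b) of a system <a, q> <= b for L_2(G) on the edges of G.
    Coordinates are keyed by vertex (q_i) and by edge tuple (q_ij)."""
    rows = []
    for (i, j) in edges:
        rows.append(({(i, j): -1}, 0))              # q_ij >= 0
        rows.append(({(i, j): 1, i: -1}, 0))        # q_ij <= q_i
        rows.append(({(i, j): 1, j: -1}, 0))        # q_ij <= q_j
        rows.append(({i: 1, j: 1, (i, j): -1}, 1))  # q_i + q_j - q_ij <= 1
    return rows

def lift(x, edges):
    """Integral point of M(G) for configuration x, using q_ij = x_i * x_j."""
    q = dict(x)
    for (i, j) in edges:
        q[(i, j)] = x[i] * x[j]
    return q

def dot(a, q):
    return sum(coef * q[k] for k, coef in a.items())

def sample_tight_potential(x, edges, rng):
    """Non-negative integer combination of the constraint normals active at x."""
    v = lift(x, edges)
    c = {k: 0 for k in v}
    for a, b in l2_rows(edges):
        if dot(a, v) == b:  # constraint is active at v
            lam = rng.randint(0, 3)
            for k, coef in a.items():
                c[k] += lam * coef
    return c

nodes = [0, 1, 2]
edges = [(0, 1), (0, 2), (1, 2)]
x = {0: 1, 1: 0, 2: 1}
c = sample_tight_potential(x, edges, random.Random(0))
v_score = dot(c, lift(x, edges))
integral_scores = [dot(c, lift(dict(zip(nodes, y)), edges))
                   for y in product((0, 1), repeat=len(nodes))]
```

Because every feasible q satisfies each active constraint, the combined potential c can score no higher at q than at the chosen vertex, which is why the enumeration check must succeed for any choice of non-negative weights.
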


We would like also to identify classes of potentials for
which L2(G) is guaranteed to be tight. We demonstrate the
power of our geometric approach by providing new proofs
in §7 of the Supplement of the following earlier results.

Lemma 7. If G is a tree, then L2(G) is tight for all potentials,
that is L2(G) = M(G).

Lemma 8. For an arbitrary graph G, L2(G) is tight for
the set of balanced models.


The proofs proceed by explicitly demonstrating that a given
potential lies in a cone N_{L2(G)}(v) for some vertex v of the
marginal polytope, by expressing the potential as a conical
combination of the extremal rays of the cone.


4.2 The Symmetry of the Sherali-Adams Polytopes 


The Sherali-Adams polytopes have rich symmetries which
can be exploited when classifying tightness of LP relaxations
using the tools discussed in §4.1. Intuitively, these
symmetries arise either by considering relabellings of the
vertices of the graph G (permutations), or relabellings of
the state space of individual variables (flippings). The key
result is that a Sherali-Adams polytope is tight for a potential
c iff it is tight for any permutation or flipping of c.


4.2.1 The Permutation Group 


Let σ ∈ S_V be a relabeling of the vertices of the graph
G. This permutation then induces a bijective map Y_σ :
L_r(G) → L_r(G) (which naturally lifts to a linear map on
ℝ^{V∪E}), given by applying the corresponding relabeling to
the components of the pseudomarginal vectors:

$$ Y_\sigma\big((q_i)_{i \in V}, (q_{ij})_{ij \in E}\big) = \big((q_{\sigma(i)})_{i \in V}, (q_{\sigma(i)\sigma(j)})_{ij \in E}\big) \quad \forall\, ((q_i)_{i \in V}, (q_{ij})_{ij \in E}) \in L_r(G). $$


Conditions Beyond Treewidth for Tightness of Higher-order LP Relaxations


The element σ ∈ S_V also naturally induces a linear map
on the space of potentials, which is formally the dual space
(ℝ^{V∪E})^*, although we will frequently identify it with
ℝ^{V∪E}. We denote the map on the space of potentials by
Y^†_σ : (ℝ^{V∪E})^* → (ℝ^{V∪E})^*, given by

$$ Y^\dagger_\sigma\big((c_i)_{i \in V}, (c_{ij})_{ij \in E}\big) = \big((c_{\sigma(i)})_{i \in V}, (c_{\sigma(i)\sigma(j)})_{ij \in E}\big) \quad \forall\, ((c_i)_{i \in V}, (c_{ij})_{ij \in E}) \in \mathbb{R}^{V \cup E}. $$


The sets {Y_σ | σ ∈ S_V} and {Y^†_σ | σ ∈ S_V} obey the group
axioms (under the operation of composition), and hence
form groups of symmetries on L_r(G) and (ℝ^{V∪E})^* respectively;
they are both naturally isomorphic to S_V.


These symmetry groups form a useful formalism for thinking
about tightness of Sherali-Adams relaxations. We provide
one such result in this language, with proof in §8 of the
Supplement.


Lemma 9. L_r(G) is tight for a given potential c ∈ ℝ^{V∪E}
iff it is tight for all potentials Y^†_σ(c), σ ∈ S_V.


4.2.2 The Flipping Group 


Whilst the permutation group described above corresponds
to permuting the labels of vertices in the graph, it is also
useful to consider the effect of permuting the labels of the
states of individual variables. In the case of binary models,
the label set is {0,1}, so permuting labels corresponds to
switching 0 ↔ 1, which we refer to as 'flipping'. Given
a variable v ∈ V, define the affine map F_(v) : ℝ^{V∪E} →
ℝ^{V∪E} which acts on any pseudomarginal q ∈ ℝ^{V∪E} to flip
v as follows (see Weller et al., 2016, for details):

$$ [F_{(v)}(q)]_v = 1 - q_v, \qquad [F_{(v)}(q)]_{vw} = q_w - q_{vw} \quad \forall\, vw \in E, $$

while F_(v) leaves unchanged all other coordinates of q.
Note that F_(v) restricts to a bijection on L_r(G). The flipping
maps commute and have order 2, hence the group generated
by them, ⟨F_(v) | v ∈ V⟩, is isomorphic to Z_2^V. A
general element of this group can be thought of as simultaneously
flipping a subset I ⊆ V of variables, written as
F_(I) : L_r(G) → L_r(G).
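
The single-variable flipping map on pseudomarginals is simple enough to sketch directly. The code below applies the two coordinate updates given in the text and lets one check on an example that a flip has order 2 and that flips of different variables commute. The function name, the dict-based encoding and the example values are hypothetical; exact rationals are used so that equality checks are meaningful.

```python
from fractions import Fraction

def flip(q_node, q_edge, v):
    """Apply the flip of variable v: q_v -> 1 - q_v, and q_vw -> q_w - q_vw
    on every edge at v. All other coordinates are unchanged."""
    new_node = dict(q_node)
    new_edge = dict(q_edge)
    new_node[v] = 1 - q_node[v]
    for (i, j), qij in q_edge.items():
        if v == i:
            new_edge[(i, j)] = q_node[j] - qij
        elif v == j:
            new_edge[(i, j)] = q_node[i] - qij
    return new_node, new_edge

# Hypothetical pseudomarginal on a single edge (0, 1).
q_node = {0: Fraction(3, 4), 1: Fraction(1, 4)}
q_edge = {(0, 1): Fraction(1, 16)}

once = flip(q_node, q_edge, 0)
twice = flip(*once, 0)
flip_0_then_1 = flip(*flip(q_node, q_edge, 0), 1)
flip_1_then_0 = flip(*flip(q_node, q_edge, 1), 0)
```

Flipping both ends of an edge sends q_vw to 1 − q_v − q_w + q_vw, the probability that both original variables are 0, regardless of the order of the two flips.
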


Flipping a subset of variables I ⊆ V also naturally induces
a map F^†_(I) : (ℝ^{V∪E})^* → (ℝ^{V∪E})^* on the space of potentials.
We give a full description of this map in §8 of the
Supplement. An analogous result to Lemma 9 also holds
for the group of flipping symmetries.

Lemma 10. L_r(G) is tight for a given potential c ∈ ℝ^{V∪E}
iff it is tight for all potentials F^†_(I)(c), I ⊆ V.


Figure 1: The left graph is a minor (unsigned) of the right graph, 
obtained by deleting the grey dotted edges and resulting isolated 
grey vertex, and contracting the purple wavy edge. See 


4.2.3 The Joint Symmetry Group of the 
Sherali-Adams Polytopes 


Tying together the remarks of §4.2.1 and §4.2.2, note that
in general the symmetries of the flipping and permutation
groups on L_r(G) do not commute. In fact, observe that

$$ Y_\sigma^{-1} \circ F_{(I)} \circ Y_\sigma = F_{(\sigma^{-1}(I))}. \tag{11} $$

Thus the group of symmetries of L_r(G) generated by permutations
and flippings is isomorphic to the semidirect
product S_V ⋉ Z_2^V.


5 FORBIDDEN MINOR CONDITIONS
FOR TIGHTNESS OF L4(G)


We first introduce graph minors and their application to the
characterizations of both treewidth and tightness of LP relaxations
over L_r. While these characterizations are the
same for r ≤ 3, a key contribution in this section is to use
the geometric perspective of §4 to show that the characterizations
are not the same for r = 4.


5.1 Graph Minor Theory 


For further background, see Chapter 12). 
Given a graph G = (V, E), a graph H is a minor of G,
written H ≤ G, if it can be obtained from G via a series
of edge deletions, vertex deletions, and edge contractions.
The result of contracting an edge uv ∈ E is the graph
G' = (V', E') where u and v are 'merged' to form a new vertex
w which is adjacent to any vertex that was previously
adjacent to either u or v. That is, V' = V \ {u, v} ∪ {w},
and E' = {e ∈ E | u, v ∉ e} ∪ {wx | ux ∈ E or vx ∈ E}.
This is illustrated in Figure 1.
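
The contraction operation can be written out directly from the definition of V' and E'. The following is a minimal sketch with hypothetical names, representing edges as unordered pairs; the example contracts one edge of a 4-cycle.

```python
def contract_edge(nodes, edges, u, v, w):
    """Contract edge uv into a new vertex w, following the definition of G'."""
    edge_set = {frozenset(e) for e in edges}
    if frozenset((u, v)) not in edge_set:
        raise ValueError("uv must be an edge of G")
    new_nodes = (set(nodes) - {u, v}) | {w}
    new_edges = set()
    for e in edge_set:
        if u not in e and v not in e:
            new_edges.add(e)                  # edge untouched by the contraction
        elif e != frozenset((u, v)):
            (x,) = e - {u, v}                 # other endpoint of an edge at u or v
            new_edges.add(frozenset((w, x)))  # reattach it to the merged vertex
    return new_nodes, new_edges

# A 4-cycle 0-1-2-3-0; contracting edge (0, 1) leaves a triangle.
cycle_nodes = [0, 1, 2, 3]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
minor_nodes, minor_edges = contract_edge(cycle_nodes, cycle_edges, 0, 1, "w")
```

The result is a triangle on {w, 2, 3}, so K3 is a minor of the 4-cycle; using a set for the new edges merges any parallel edges that a contraction would otherwise create.
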


A property of a graph is closed under taking minors or
minor-closed if whenever G has the property and H ≤ G,
then H also has the property.


The celebrated graph minor theorem of Robertson and Seymour
(2004) proves that any minor-closed graph property
may be characterized by a unique finite set {H_1, ..., H_m}
of minimal (wrt minor operations) forbidden minors; that
is, a graph G has the property iff it does not contain any
H_i as a minor. Checking to see if a graph contains some
H as a minor may be performed efficiently (Robertson and
Seymour, 1995).


Figure 2: The left graph is a signed minor of the right signed
graph, obtained similarly to Figure 1 except that before contracting
the repulsive edge, first flip the vertex at its right end. Solid
blue (dashed red) edges are attractive (repulsive). Grey dotted
edges on the right are deleted and may be of any sign. See


We shall also consider signed graphs (see §3.1) and their
respective signed minors. A signed minor of a signed graph
is obtained as before by edge deletion, vertex deletion, and
edge contraction but now: contraction may be performed
only on even (attractive) edges; and any resigning operation
is also allowed, in which a subset of vertices S ⊆ V is
selected then all edges with exactly one end in S are flipped
even ↔ odd.² An example is shown in Figure 2. This
notion of flipping is closely related to the notion of flipping
of potentials, introduced in §4.2.2.
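
The resigning operation is straightforward to state in code. This sketch uses a hypothetical dict encoding of a signed graph (vertex pairs mapped to 'even' or 'odd') and toggles exactly those edges that have one endpoint in S.

```python
def resign(signed_edges, S):
    """Flip even <-> odd on every edge with exactly one endpoint in S."""
    S = set(S)
    swapped = {"even": "odd", "odd": "even"}
    return {
        (u, v): swapped[sign] if (u in S) != (v in S) else sign
        for (u, v), sign in signed_edges.items()
    }

odd_k3 = {(0, 1): "odd", (0, 2): "odd", (1, 2): "odd"}
resigned = resign(odd_k3, {0})
restored = resign(resigned, {0})
```

Resigning an odd-K3 at one vertex leaves a triangle with a single odd edge, and resigning at the same set again restores the original signing. In both signings the triangle has an odd number of odd edges, which is why these signings are treated as equivalent.
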


The graph minor theorem of Robertson and Seymour generalizes
to signed graphs (Huynh, 2009;
2014): any property of a signed graph which is closed under
taking signed minors may be characterized by a unique
finite set of minimal forbidden signed minors.


5.2 Treewidth Characterizations of Tightness 


A fundamental result in the study of LPs over Sherali-
Adams relaxations is the following sufficient condition.


Theorem 11 (Wainwright and Jordan, 2004). If G has
treewidth ≤ r − 1, then L_r(G) = M(G); equivalently,
LP+L_r is tight for all valid potentials on G.


The goal of this section is to study to what extent a partial 
converse to this result holds; that is, to what extent tightness 
of a Sherali-Adams polytope can hold for graphs of high 
treewidth. We focus on the case of L4(G), and proceed 
based on the graph minor properties of G. 


See §1.1 for a quick review of results that the properties,
for any r, of treewidth ≤ r − 1, and of LP+L_r being tight
for all valid potentials, are both minor-closed
(Weller, 2016a). Thus, by the graph minor theorem, both are
characterized by a finite set of minimal forbidden minors.


²Hence, to contract an odd edge, one may first flip either end
of the edge to make it even, then contract. In our context of binary
pairwise models, flipping a subset S is equivalent to switching
from a model with variables {X_i} to a new model with variables
{Y_i : Y_i = 1 − X_i ∀ i ∈ S, Y_i = X_i ∀ i ∈ V \ S} and setting
potentials to preserve the distribution, which flips the sign of W_ij
for any edge ij with one end in S; details in §2.4).


Figure 3: K3 (unsigned), the
only element of T2, the set of
minimal forbidden minors for
treewidth ≤ 1. T2 = U2,
the set of forbidden minors for
LP+L2 to be tight. See


Figure 4: K4 (unsigned), the
only element of T3, the set of
minimal forbidden minors for
treewidth ≤ 2. T3 = U3,
the set of forbidden minors for
LP+L3 to be tight. See


(a) K5 (b) Octahedral graph (c) Wagner graph (d) Pentagonal prism graph


Figure 5: The four members of T4, the set of minimal forbidden
minors (unsigned) for treewidth ≤ 3. We show the new result
that T4 ≠ U4, in fact T4 ∩ U4 = {K5}, where U4 is the set of
forbidden minors for LP+L4 to be tight. See


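We shall often be interested in the set of potentials $\widehat{\Sigma} = \{c \in \mathbb{R}^{V \cup E} \mid \operatorname{sign}(c_e) = \Sigma(e)\ \forall e \in E\}$ that are consistent with a given signing $\Sigma$ of the graph $G$. We say that the polytope $L_r(G)$ is tight for a signing $\Sigma$ if LP+$L_r$ is tight for all potentials in the set $\widehat{\Sigma} \subset \mathbb{R}^{V \cup E}$.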


The property of a signed graph $(G, \Sigma)$ that $L_r(G)$ is tight for the signing $\Sigma$ is also minor-closed (2016a), hence can also be characterized by minimal forbidden signed minors.


Recalling the notation introduced earlier for unsigned graphs, we call the respective sets of minimal forbidden minors $T_r$ (for treewidth $\le r-1$) and $U_r$ (for LP tightness over $L_r$); for signed graphs, $U'_r$ is the set of minimal forbidden signed minors for tightness of LP+$L_r$. It is known that in fact $T_r = U_r$ for $r = 2, 3$, and that $U'_r$ is exactly the set of odd signings (i.e. signings where all edges of the graph are odd/repulsive) of the graphs of $T_r$ for $r = 2, 3$. We shall show in §5.3 that both of these relationships break down for $r = 4$.


The sets $T_2$, $T_3$ and $T_4$ are shown in Figures 3, 4 and 5, whilst the sets $U'_2$ and $U'_3$ are shown in Figures 6 and 7.




Figure 6: Odd-$K_3$, the unique element of $U'_2$, the set of minimal forbidden signed minors for tightness of $L_2(G)$. Red dashed edges represent repulsive edges.


Figure 7: Odd-$K_4$, the unique element of $U'_3$, the set of minimal forbidden signed minors for tightness of $L_3(G)$. Red dashed edges represent repulsive edges.


5.3 Identifying Forbidden Signed Minors 


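$L_4(G)$ is tight for a potential $c \in \mathbb{R}^{V \cup E}$ if and only if

$$\max_{q \in L_4(G)} \langle c, q \rangle = \max_{x \in M(G)} \langle c, x \rangle, \tag{12}$$

or equivalently if

$$\max_{q \in L_4(G)} \min_{x \in M(G)} \left[ \langle c, q \rangle - \langle c, x \rangle \right] = 0. \tag{13}$$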

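Since $M(G) \subseteq L_4(G)$, we have $\max_{q \in L_4(G)} \langle c, q \rangle \ge \max_{x \in M(G)} \langle c, x \rangle$ for all $c \in \mathbb{R}^{V \cup E}$. Hence, it follows that $L_4(G)$ is not tight for some potential $c \in \widehat{\Sigma}$ (the set of potentials respecting a signing $\Sigma$ of $G$) iff the following optimization problem has a non-zero optimum:

$$\max_{c \in \widehat{\Sigma}} \max_{q \in L_4(G)} \min_{x \in M(G)} \left[ \langle c, q \rangle - \langle c, x \rangle \right]. \tag{14}$$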


For the graphs in $T_4$, this is a high-dimensional indefinite quadratic program which is intractable to solve. However, using the geometric ideas of §4, we may decompose this problem into a sequence of tractable linear programs. This process involves computing vertex representations (V-representations) for a variety of polytopes using the ideas of §4, and computing the orbits of the set of signings of $G$ under the natural action of the group described earlier; see §11 of the Supplement for full details. Solving these linear programs then allows the exact set of non-tight signings of the graphs in Figure 5 to be identified; see Figure 8.
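The orbit computation can be illustrated with a small sketch. It assumes, purely for illustration, that the group is generated by flipping single vertices and by relabeling vertices, and it handles only complete graphs $K_n$; the group actually used is the one specified in the paper and its Supplement, and the function name `signing_orbits` is hypothetical. A signing is stored as a bitmask over edges, with a set bit marking an odd edge.

```python
from itertools import combinations

def signing_orbits(n):
    """Orbits of signings of K_n under vertex flips and vertex relabelings."""
    edges = list(combinations(range(n), 2))
    index = {e: k for k, e in enumerate(edges)}

    def flip(mask, v):
        # Toggle the sign of every edge incident to v.
        for e, k in index.items():
            if v in e:
                mask ^= 1 << k
        return mask

    def swap(mask, a, b):
        # Relabel vertices a <-> b and carry the odd edges along.
        perm = {a: b, b: a}
        out = 0
        for (i, j), k in index.items():
            if (mask >> k) & 1:
                i2, j2 = perm.get(i, i), perm.get(j, j)
                out |= 1 << index[(min(i2, j2), max(i2, j2))]
        return out

    seen, orbits = set(), []
    for start in range(1 << len(edges)):
        if start in seen:
            continue
        # Close the orbit under the generators (all are involutions).
        orbit, stack = {start}, [start]
        while stack:
            m = stack.pop()
            neighbours = [flip(m, v) for v in range(n)]
            neighbours += [swap(m, a, a + 1) for a in range(n - 1)]
            for x in neighbours:
                if x not in orbit:
                    orbit.add(x)
                    stack.append(x)
        seen |= orbit
        orbits.append(orbit)
    return orbits
```

Under these assumptions, only one representative per orbit (for example `min(orbit)`) would need to be checked for tightness, rather than all $2^{|E|}$ signings.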


Theorem 12. The only non-tight signing for $L_4$ of any minimal forbidden minor for treewidth $\le 3$ is the odd-$K_5$.


5.4 Discussion: Other Forbidden Minors 


Previous work showed that tightness for all valid potentials of LP+$L_2$ may be characterized exactly by forbidding just an odd-$K_3$ as a signed minor, and that a similar result for LP+$L_3$ holds by forbidding just an odd-$K_4$ (2016a). These are precisely the odd versions of the forbidden minors for the respective treewidth conditions.


A natural conjecture for $L_4$ was that one must forbid just some signings of the four graphs in $T_4$ (see Figure 5). Now, given Theorem 12, it would seem sensible to wonder: is LP+$L_4$ tight for all valid potentials iff a model's graph does not contain an odd-$K_5$?


Figure 8: Odd-$K_5$, the unique signing of an element of $T_4$, the set of minimal forbidden unsigned minors for treewidth $\le 3$, that appears in $U'_4$, the set of minimal forbidden signed minors for $L_4(G)$, as shown by our new Theorem 12. It was previously known that for $L_r$, $r \le 3$, the minimal forbidden signed minors for LP-tightness are exactly the odd versions of the minimal forbidden unsigned minors for treewidth $\le r-1$. We believe there must be at least one other forbidden minor for tightness of $L_4$, see §5.4. Red dashed edges represent repulsive edges.


However, this must be false (unless P=NP). If it were true, then LP+$L_4$ would be tight for any model not containing $K_5$ (as an unsigned minor). It is well known that planar graphs are exactly those without $K_5$ or $K_{3,3}$ as a minor ($K_{3,3}$ is the complete bipartite graph where each partition has 3 vertices), so planar graphs form a subclass of the $K_5$-free graphs. Hence, we would have a polytime method to solve MAP inference for any planar binary pairwise model. Yet it is not hard to see that we may encode minimum vertex cover in such a model, and it is known that planar minimum vertex cover is NP-hard (1982).
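One possible encoding of minimum vertex cover as MAP inference is sketched below; it is an illustrative construction, not necessarily the one the authors have in mind. It assumes a score $\sum_i \theta_i x_i + \sum_{ij} W_{ij} x_i x_j$ plus a constant, where $x_i = 1$ means vertex $i$ is in the cover. Each vertex costs 1, and each edge with neither end selected incurs a penalty $M$, using $-M(1-x_i)(1-x_j) = -M + M x_i + M x_j - M x_i x_j$. The names `vertex_cover_model` and `map_value` are hypothetical.

```python
from itertools import product

def vertex_cover_model(n, edges, M=None):
    """Binary pairwise model whose MAP value is -(minimum vertex cover size)."""
    M = n + 1 if M is None else M  # penalty larger than any cover size
    theta = {i: -1 for i in range(n)}  # unit cost for selecting a vertex
    W = {}
    for i, j in edges:
        theta[i] += M
        theta[j] += M
        W[(i, j)] = -M  # every edge potential is repulsive
    const = -M * len(edges)
    return theta, W, const

def map_value(n, theta, W, const):
    """Brute-force MAP value; exponential in n, for small examples only."""
    best = None
    for x in product((0, 1), repeat=n):
        s = (const
             + sum(theta[i] * x[i] for i in range(n))
             + sum(w * x[i] * x[j] for (i, j), w in W.items()))
        best = s if best is None else max(best, s)
    return best

# 5-cycle (planar): a minimum vertex cover has 3 vertices.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
theta, W, const = vertex_cover_model(5, cycle)
value = map_value(5, theta, W, const)
```

In this encoding all edges are repulsive, so the resulting signed graph is an odd signing of the input graph.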


We have not yet been able to identify any other minimal forbidden minor for tightness of LP+$L_4$, but note that one natural candidate is some signing of a $k \times k$ grid of sufficient size, since this is planar with treewidth $k$.


6 CONCLUSION 


LP relaxations are widely used for the fundamental task of MAP inference for graphical models. Considering binary pairwise models, we have provided theoretical results on when various relaxations are guaranteed to be tight; when a relaxation is tight, an exact solution can be found efficiently in practice.


A key result focuses on the connection between tightness 
of LP relaxations of a model and the treewidth of its graph. 
For the first two levels of the Sherali-Adams hierarchy, that 
is for the pairwise and triplet relaxations, it was known that 
the characterizations are essentially identical. However, we 
have shown that this pattern does not hold for the next level 
in the hierarchy, that is for the quadruplet polytope L4. 


We refined this result by considering the signed graph of a 
model and its signed minors. To derive these results we in- 
troduced geometric methods which may be of independent 
interest. 